Chapter 5: Greedy Algorithms
-
Greedy Algorithms
-
Chooses action that provides most immediate benefit.
-
Minimum Spanning Trees
-
Removing a cycle edge will not disconnect a graph.
-
All spanning trees of graph \(G = (V, E) \) have \(|V|-1\) edges.
-
Kruskal's Algorithm: repeatedly add the lightest edge that doesn't create a cycle (see the sketch below).
Reverse Kruskal's: repeatedly remove the heaviest edge whose removal doesn't disconnect the graph.
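A minimal Python sketch of Kruskal's algorithm, assuming edges are given as (weight, u, v) tuples
and vertices are numbered 0..n-1; cycle detection uses a simplified union-find (the full structure
with ranks appears in the Union-Find section below).

def kruskal(n, edges):
    # edges: list of (weight, u, v); returns the list of MST edges
    parent = list(range(n))                  # each vertex starts as its own component

    def find(x):                             # root of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving, a simple form of compression
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):            # lightest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                         # edge joins two components, so no cycle
            parent[ru] = rv                  # merge the components
            mst.append((u, v, w))
    return mst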
-
Cut Property: If edge \(e\) is the lightest edge across cut \((S, V-S)\), then \(e\) is a
part of some MST.
-
Cycle Property: If edge \(e\) is the heaviest edge on some cycle (strictly heavier than every other
edge on that cycle), it cannot be part of any MST.
-
Prim's Algorithm: start with an empty set of edges; repeatedly find a cut and add the lightest edge
across it.
A straightforward way of obtaining cuts is 'growing' a tree \(T\) from a start vertex. The cut is then \((T, V-T)\).
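A minimal Python sketch of Prim's algorithm with a binary heap, assuming adj[u] is a list of
(weight, v) pairs (names are illustrative).

import heapq

def prim(adj, start=0):
    in_tree = {start}                                  # the growing tree T
    heap = [(w, start, v) for w, v in adj[start]]      # edges across the cut (T, V - T)
    heapq.heapify(heap)
    mst = []
    while heap:
        w, u, v = heapq.heappop(heap)                  # lightest edge across the current cut
        if v in in_tree:
            continue                                   # stale entry: both endpoints already in T
        in_tree.add(v)
        mst.append((u, v, w))
        for w2, x in adj[v]:                           # new edges across the cut
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst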
-
Union-Find
-
makeset(x) - make a singleton set of vertex x.
-
find(x) - find root of x's component.
-
union(x, y) - merge x and y's components.
-
For any non-root \(x\), the rank of \(x\)'s parent is strictly larger than the rank of \(x\).
-
For any node of rank \(k\) there are at least \(2^k\) nodes in its subtree.
-
If there are \(n\) elements, there can be at most \(\frac{n}{2^k}\) elements of rank \(k\) (nodes of
rank \(k\) have disjoint subtrees, each containing at least \(2^k\) nodes).
-
Path compression: during find(x), make root the parent of all nodes on path from x to root.
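A minimal Python sketch of union-find with union by rank and path compression (array-based; names
are illustrative).

class UnionFind:
    def __init__(self, n):                  # makeset(x) for x = 0 .. n-1
        self.parent = list(range(n))        # every element starts as its own root
        self.rank = [0] * n                 # rank: upper bound on the height of x's subtree

    def find(self, x):                      # root of x's component
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression: point x at the root
        return self.parent[x]

    def union(self, x, y):                  # merge x's and y's components
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return                          # already in the same component
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx                 # make rx the root of larger rank
        self.parent[ry] = rx                # hang the lower-rank root under the higher-rank one
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1              # equal ranks: the surviving root's rank grows by one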
-
Huffman Encoding
-
Make each character into a node containing its frequency and heapify the nodes; repeatedly pop the
two lowest-frequency nodes, merge them into a node whose frequency is their sum, and push it back,
until a single node (the root of the tree) remains (see the sketch below).
-
Cost of tree: \(\sum\limits_{i=1}^{n} f_i \cdot (\text{depth of the } i\text{th symbol in the tree})\).
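A minimal Python sketch of the greedy construction and the cost above, assuming freq is a dict
mapping symbols to frequencies; each merge adds the merged frequency to the cost, since every
symbol below it ends up one level deeper.

import heapq
from itertools import count

def huffman_cost(freq):
    tie = count()                                   # unique tie-breaker so heap tuples compare
    heap = [(f, next(tie)) for f in freq.values()]
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        f1, _ = heapq.heappop(heap)                 # the two lowest-frequency nodes
        f2, _ = heapq.heappop(heap)
        cost += f1 + f2                             # merging deepens every symbol underneath by 1
        heapq.heappush(heap, (f1 + f2, next(tie)))  # push the merged node back
    return cost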
-
Entropy: the average number of bits per symbol needed to encode the sequence, where \(p_i\) is the
probability of the \(i\)th symbol:
\(\sum\limits_{i=1}^{n} p_i \log_2\left(\frac{1}{p_i}\right)\)
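For example, with symbol probabilities \(p = (\frac{1}{2}, \frac{1}{4}, \frac{1}{4})\):
\(\frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + \frac{1}{4}\log_2 4 = \frac{1}{2} + \frac{1}{2} + \frac{1}{2} = 1.5\) bits per symbol.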
-
more compressible = less random = more predictable
-
If some character occurs with frequency more than \(\frac{2}{5}\), there is always a codeword of
length 1.
If all characters occur with frequency less than \(\frac{1}{3}\), there is no codeword of length 1.
(Proof: Disc 5, Fa18.)
-
Horn Formulas
-
A Horn formula consists of implications and negative clauses.
-
implications: a conjunction of positive literals implying a single positive literal.
e.g. \((z \wedge x) \Longrightarrow y\)
-
negative clauses: disjunctions of negative literals.
e.g. \((\bar x \vee \bar y)\)
-
satisfying assignment: an assignment of true/false to the variables that satisfies all the clauses.
-
Greedy Algorithm: start with all variables set to false; while some implication has its left-hand
side entirely true but its right-hand side false, set the right-hand-side variable to true.
If the resulting assignment satisfies the negative clauses, return it; otherwise the formula is
unsatisfiable.
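A minimal Python sketch of this greedy procedure, assuming implications are (lhs_vars, rhs_var)
pairs and negative clauses are lists of variables (illustrative names; worst-case quadratic, not
the linear-time version mentioned below).

def horn_sat(variables, implications, negative_clauses):
    true_vars = set()                       # start with every variable false
    changed = True
    while changed:                          # repeat until no implication forces anything new
        changed = False
        for lhs, rhs in implications:
            if rhs not in true_vars and all(v in true_vars for v in lhs):
                true_vars.add(rhs)          # LHS all true, so RHS is forced to true
                changed = True
    for clause in negative_clauses:         # each needs at least one variable left false
        if all(v in true_vars for v in clause):
            return None                     # some negative clause is violated: unsatisfiable
    return {v: (v in true_vars) for v in variables}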
-
The algorithm sets the fewest possible variables to true: every variable it sets to true must be
true in any satisfying assignment.
Proof: the algorithm only sets a variable to true when the left-hand side of some implication is
already true, and hence (by induction) forced to be true in every satisfying assignment.
-
There exists a linear-time algorithm.
-
Distance from Optimality
-
Example: Set Cover
-
Greedy: repeatedly choose the set \(S\) that covers the largest number of uncovered elements.
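A minimal Python sketch of this greedy rule, assuming universe is a set and sets is a non-empty
list of sets (illustrative names).

def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    chosen = []                                    # indices of the sets picked
    while uncovered:
        # pick the set covering the largest number of still-uncovered elements
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            break                                  # remaining elements cannot be covered
        chosen.append(best)
        uncovered -= sets[best]
    return chosen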
-
Suppose the optimal solution uses \(k\) sets. Then after any number of greedy iterations, the
remaining uncovered elements can still be covered with \(k\) sets.
-
So some set covers at least \(\frac{n_t}{k}\) of them, where \(n_t\) = number of elements still
uncovered after \(t\) iterations. Hence the next greedy pick leaves at most \(n_t(1-\frac{1}{k})\)
elements uncovered, i.e., \(n_{t+1} \leq n_t(1-\frac{1}{k})\).
-
Since \(1-x \leq e^{-x}\), \(n_t \leq n_0(1-\frac{1}{k})^t \leq n_0(e^{-1/k})^t = n_0e^{-t/k}\),
where \(n_0 = n\) is the total number of elements.
At \(t = k\ln n\), \(n_t < 1\), so no elements remain. Hence greedy uses at most \(k \ln n\) sets.
-
Exchange Argument
-
Assume there exists an optimal solution with lower cost than the greedy solution.
-
Since it has lower cost, the optimal solution \(O\) must differ from the greedy solution \(A\).
-
Change \(O\) to make it more similar to \(A\), and show that the change does not make the solution worse.
-
Since these exchanges never make \(O\) worse, \(O\) can be turned into \(A\) without increasing its
cost, contradicting the assumption. Hence there is no solution with lower cost than \(A\).
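Example: the cut property can be shown this way. Suppose \(e\) is the lightest edge across the cut
\((S, V-S)\) and \(T\) is an MST that does not contain \(e\). Adding \(e\) to \(T\) creates a cycle,
which must cross the cut at some other edge \(e'\) of weight at least that of \(e\). Exchanging
\(e'\) for \(e\) gives a spanning tree that is no heavier, so \(e\) is part of some MST.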